Abstract

Background: Enterococcus mundtii is a yellow-pigmented microorganism rarely found in human infections. The
draft genome sequence of E. mundtii was recently announced. Its genome encodes at least 2,589 genes and 57
RNAs, and 4 putative genomic islands have been detected. The objective of this study was to compare the genetic
content of E. mundtii with that of other enterococcal species and, more specifically, to identify genes coding for
putative virulence traits present in enterococcal opportunistic pathogens.

Results: An in-depth mining of the annotated genome was performed to uncover the unique properties
of this microorganism, which allowed us to detect a gene encoding the antimicrobial peptide mundticin, among
other relevant features. Moreover, a comparative genomic analysis against commensal and pathogenic
enterococcal species for which genomic sequences have been released was conducted for the first time.
Our study reveals significant similarities in gene content between this environmental isolate and the selected
enterococcal strains (sharing an "enterococcal gene core" of 805 CDS), which helps to explain the persistence of this
genus in different niches and improves our knowledge of the genetics of this diverse group of microorganisms, which
includes environmental, commensal and opportunistic pathogenic members.

Conclusion: Although E. mundtii CRL1656 is phylogenetically closer to E. faecium, which is frequently responsible for
nosocomial infections, this strain does not encode the most relevant virulence factors found in enterococcal clinical
isolates, and bioinformatic predictions indicate that it possesses the lowest number of putative pathogenic genes among
the most representative enterococcal species. Accordingly, infection assays using the Galleria mellonella model confirmed
its low virulence.
Background

The genus Enterococcus is a diverse group of low-GC gram-positive bacteria that contains over 33 species. The
genus includes commensal species of the gastrointestinal tracts of humans and animals, and also environmental
strains that can be isolated from soil, surface waters and plant material [1,2]. Enterococci are nutritionally
fastidious microorganisms, which are associated with a large variety of human activities. In this sense, several
strains have technological relevance since they are present in dairy, meat and other fermented foods, and some of
them show probiotic effects [1,2]. Nevertheless, they are not "generally recognized as safe" (GRAS) microorganisms
for human consumption [U.S. Food and Drug Administration (FDA) or European Food Safety Authority (EFSA)], even
though they are phylogenetically related to the group of lactic acid bacteria (LAB). Particularly during the last
decade, Enterococcus faecalis and Enterococcus faecium strains emerged as opportunistic human pathogens frequently
associated with nosocomial infections, with a high capacity to disseminate antibiotic resistance [3]. As a
consequence, information on the genetics and physiology of these species has increased dramatically in recent years;
however, few data are available regarding other enterococci.

Enterococcus mundtii was described as a non-motile, yellow-pigmented enterococcus typically isolated from
plant material, soil, cow teats and milker's hands [4,5] and infrequently associated with human infection [6]. Like
other enterococci, it is a facultative anaerobe and displays a homolactic glucose metabolism. Its DNA GC content
ranges from 38 to 39%, as determined by the melting temperature method [4]. We have recently announced the draft
genome sequence of E. mundtii strain CRL1656 [7]. This strain was isolated from the stripping milk of an Argentinean
cow and, given its bacteriocinogenic capacity, it has been proposed as a probiotic microorganism to prevent mastitis
in these mammals [8].

* Correspondence: magni@ibr-conicet.gov.ar
Instituto de Biología Molecular y Celular de Rosario (IBR-CONICET) and Departamento de Microbiología, Facultad de
Ciencias Bioquímicas y Farmacéuticas, Universidad Nacional de Rosario, Suipacha 531, Rosario S2002LRK, Argentina
Full list of author information is available at the end of the article

During the last decade, the availability of bioinformat- 
ics approaches for comparison of multiple bacterial 
genomes allowed the analysis of a huge amount of se- 
quence data. In this sense, the aim of the present study 
was to gain insight into the similarities between the gen- 
etic content of E. mundtii and other enterococcal species 
and, more specifically, to investigate genes coding for 
virulence factors shared with E. faecalis and E. faecium,
species that frequently behave as opportunistic pathogens.

Methods 

General inspection of the E. mundtii genome 

Gene products putatively encoded by E. mundtii CRL1656 were identified using RAST [9] (GenBank accession
number: AFWZ00000000.1). To improve this sequence, an all-versus-all BLASTN comparison was performed, and
contigs shorter than 1,000 bp that shared more than 99% identity with sequences already contained in a longer
contig were deleted. In this manner, 87 contigs were removed from the database (approximately 200 kb in total),
resulting in a genome of 2.87 Mb (GC content 38.4%). The remaining contigs were ordered and oriented with
Advanced PipMaker using the E. faecalis V583 genome as a reference [10]. Finally, contigs were concatenated,
using a Perl script designed ad hoc, by inserting the sequence NNNNNCACACACTTAATTAATTAAGTGTGTGNNNNN between
them; this linker harbors stop codons in all six reading frames [11].
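The concatenation step can be illustrated with a minimal Python sketch. This is not the authors' Perl script: the function names are hypothetical and only the linker sequence is taken from the text. The sketch joins ordered contigs with the linker and checks that the linker carries a stop codon in each of the six reading frames.

```python
# Linker sequence quoted in the text; every function name below is illustrative.
LINKER = "NNNNNCACACACTTAATTAATTAAGTGTGTGNNNNN"
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    return seq.translate(COMPLEMENT)[::-1]


def frames_with_stop(seq):
    """Return the reading frames (+1..+3, -1..-3) that contain a stop codon."""
    frames = set()
    for sign, strand in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            if any(codon in STOP_CODONS for codon in codons):
                frames.add(sign * (offset + 1))
    return frames


def concatenate_contigs(contigs):
    """Join ordered, oriented contigs into a single pseudomolecule."""
    return LINKER.join(contigs)
```

Because the linker is its own reverse complement, the three TAA codons in its centre cover all frames on both strands, so no open reading frame can run from one contig into the next.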

After annotation of each of the draft genomic se- 
quences in RAST (Table 1), as explained for E. mundtii, 
comparative genome analysis against other enterococcal 
species was performed. E. mundtii unique genes were 
determined by using RAST Compare Metabolic Recon- 
struction Tool [9]. 

ANI versus shared genes with similar function plot 

ANI values were calculated based on pairwise alignment of genome stretches using the JSpecies software with
the BLAST algorithm [17]. The calculation of ANI values is implemented as described by Goris et al. [18].
Genes with similar function shared between E. mundtii and each enterococcal species described in Table 1 were
assessed using the Compare Metabolic Reconstruction tool from the RAST server. Genes that were associated with
a RAST subsystem in both microorganisms were considered as shared between them [9].
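As a rough illustration of the ANI idea, the sketch below averages the identity of fragment-to-genome BLAST hits that pass an identity filter and a coverage filter. The default thresholds (30% identity over at least 70% of the fragment) are assumed here from the commonly cited description of the method in [18] and are not stated in this article. The input tuples and the function name are hypothetical, and JSpecies itself should be used for real analyses.

```python
def average_nucleotide_identity(hits, min_identity=30.0, min_coverage=0.70):
    """Mean percent identity over fragment hits that pass both filters.

    hits: iterable of (percent_identity, aligned_length, fragment_length),
    one best hit per query fragment. The thresholds are assumed defaults.
    """
    kept = [
        identity
        for identity, aligned, fragment in hits
        if identity >= min_identity and aligned / fragment >= min_coverage
    ]
    if not kept:
        return None  # no fragment aligned well enough to estimate ANI
    return sum(kept) / len(kept)
```

Fragments that align poorly are excluded from the mean rather than counted as zero, which is why ANI is usually read together with the fraction of the genome that could be aligned.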

Study of genomic islands 

Putative genomic islands (GEIs) were detected as follows. First, the regions detected by Alien-Hunter [19]
were selected; second, a positive result obtained with either Colombo/SIGI-HMM [20] or IslandPath-DIMOB [21]
was required for the next step. Regions identified using this approach were manually inspected using Artemis
and DNAPlotter [22] in order to find further evidence related to GEIs in general and pathogenicity islands
(PAIs) in particular [23]. A deviation in G + C content (calculated by DNAPlotter), together with the presence
of insertion sequences and flanking tRNA genes as well as transposase-coding genes, which are important for
DNA incorporation processes, was accepted as further evidence of the presence of a GEI. Lastly, specific
features of each gene located within the putative PAI were determined. In particular, genes involved in
virulence, antibiotic resistance and pathogenic mechanisms were traced using the BLASTP tool of the Virulence
Factors Data Base server [24]. Other functions of the features found in these regions were extracted from RAST
annotations.
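The first two filtering steps amount to keeping Alien-Hunter regions that are supported by at least one of the two other predictors. A minimal sketch of that logic follows. Region coordinates are hypothetical (start, end) pairs, and using simple interval overlap as the support criterion is an assumption, since the text does not say how predictions were matched.

```python
def overlaps(a, b):
    """True if two (start, end) intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def candidate_islands(alien_hunter, sigi_hmm, islandpath):
    """Keep Alien-Hunter regions supported by SIGI-HMM or IslandPath-DIMOB."""
    support = list(sigi_hmm) + list(islandpath)
    return [
        region
        for region in alien_hunter
        if any(overlaps(region, other) for other in support)
    ]
```

The candidates returned by such a filter would still need the manual inspection described above (GC deviation, tRNA genes, transposases) before being called GEIs.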

Phylogenetic tree construction 

This analysis involved sequences from nine enterococci (Table 1) and the outgroup Lactococcus lactis
SK11. Orthologous proteins were assigned using the OrthoMCL software [25]. When a particular enterococcal
species contained more than one protein from the same group of orthologs, only the protein with the lowest
e-value was considered for the alignment. Concatenated orthologous protein sequences were aligned using
ClustalX [26], and poorly aligned positions as well as excessively divergent regions were trimmed using
Gblocks 0.91b [27], resulting in a final alignment containing a total of 206,745 residues. Finally, the
evolutionary history was inferred using the Randomized Axelerated Maximum Likelihood algorithm (RAxML [28]).
DCMUT with empirical base frequencies and a GAMMA distribution was used as the substitution model. The
reliability of the inferred tree was tested by bootstrapping with 1000 replicates.
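The rule for choosing one protein per genome from each ortholog group can be sketched as below. The tuple layout and function name are hypothetical, and the real input would come from OrthoMCL output. A group contributes to the concatenated alignment only if every genome is represented.

```python
def pick_representatives(group, genomes):
    """Return {genome: protein_id} for one ortholog group, or None.

    group: list of (genome, protein_id, e_value) tuples.
    Within a genome, the protein with the lowest e-value is kept.
    Groups missing any genome are not core groups and yield None.
    """
    best = {}
    for genome, protein_id, e_value in group:
        if genome not in best or e_value < best[genome][1]:
            best[genome] = (protein_id, e_value)
    if not all(genome in best for genome in genomes):
        return None
    return {genome: best[genome][0] for genome in genomes}
```

Applying this to every group and keeping the non-empty results gives the set of core proteins to concatenate, one sequence per genome per group.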

Analysis of regulators 

The presence of orthologs of putative regulator proteins of E. mundtii CRL1656 in the E. faecalis V583
and E. faecium DO genomes was determined as described in
Additional file 1: Figure S1. Briefly, all protein sequences



Table 1 Enterococcal species analyzed in this study and relevant features of their genomic sequences

Organism                       BioProject                Status                Size   GC    Genes  Proteins  Proteins  Putative pathogenicity  Source          References
                                                                               (Mbp)  %            (PubMed)  (RAST)    determinants (a)
E. mundtii CRL1656             PRJNA71221                Scaffolds or contigs  2.87   38.4  2,646  2,589     2,589     659                     Cow udder       [7]
E. faecalis V583               PRJNA57669, PRJNA70       Complete              3.36   37.4  3,412  3,264     3,363     1,006                   Clinical        [10]
E. faecalis 62                 PRJNA159663, PRJNA61185   Complete              3.13   37.4  3,157  3,075     3,075     876                     Commensal       [12]
E. faecium XT16 (DO)           PRJNA55353, PRJNA30627    Complete              3.05   37.9  3,209  3,114     2,779     728                     Clinical        [13]
E. faecium Com15               PRJNA55725, PRJNA32967    Scaffolds or contigs  2.77   38.2  2,783  2,724     2,773     755                     Commensal       [14]
E. italicus DSM 15952          PRJNA61487, PRJNA53039    Scaffolds or contigs  2.31   39.2  2,455  2,405     2,275     700                     Cheese          [15]
E. casseliflavus ATCC 12755    PRJNA63559, PRJNA53041    Scaffolds or contigs  3.55   42.4  3,606  3,548     3,415     942                     Oral commensal  [14]
E. gallinarum EG2              PRJNA55685, PRJNA32927    Scaffolds or contigs  3.13   40.6  3,079  3,041     3,072     862                     Clinical        [16]
E. saccharolyticus ATCC 43076  PRJNA206365, PRJNA191890  Scaffolds or contigs  2.60   36.9  2,625  2,582     2,594     780                     Straw bedding   [15]

(a) According to predictions made using the Virulence Factors Data Base (see Methods).
assigned as regulator proteins by RAST were used as queries in a BLASTP search against the E. faecalis V583
and E. faecium DO genome sequences. Cut-off and cut-in values were established as described in Additional
file 1: Figure S1C. E. faecalis V583 and E. faecium DO proteins sharing an identity percentage with E. mundtii
CRL1656 proteins higher than the cut-in value were then considered as present in the E. mundtii genome, while
those with identity percentages below the cut-off value were considered absent. Proteins with identity between
the cut-in and cut-off values were further analyzed individually according to their gene context (Additional
file 1: Figure S1D). For those E. mundtii CRL1656 proteins showing no homologs in E. faecium or E. faecalis,
a BLASTP search was performed against a non-redundant database (RefSeq and GenBank) with a bit-score cut-off
of 60.
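The three-way decision based on cut-in and cut-off values can be sketched as a small function. The threshold values themselves are given only in Additional file 1, so they are passed in as parameters here. The function name and the returned labels are hypothetical.

```python
def classify_ortholog(identity, cut_in, cut_off):
    """Classify a best BLASTP hit by percent identity.

    identity above cut_in      -> ortholog considered present
    identity below cut_off     -> ortholog considered absent
    anything in between        -> needs manual gene-context inspection
    """
    if cut_off > cut_in:
        raise ValueError("cut_off must not exceed cut_in")
    if identity > cut_in:
        return "present"
    if identity < cut_off:
        return "absent"
    return "inspect gene context"
```

For example, with placeholder thresholds of 60 and 30 (not the values used in the study), a hit at 45% identity would fall into the grey zone and be resolved by synteny.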

Semiqualitative comparison of potential virulence factors 

For the enterococcal species described in Table 1, BLASTP analysis of their respective proteomes was
performed against the Virulence Factors Data Base [24]. The expectation value used as cut-off was 10^-1.
Only the first hit was recovered for each protein query.
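Counting putative pathogenicity determinants in this way reduces to counting query proteins whose first reported hit passes the E-value cut-off. The sketch below assumes BLAST tabular output (`-outfmt 6`), in which column 1 is the query ID and column 11 the E-value. The function name is hypothetical and the cut-off is a parameter, with 0.1 as an assumed default.

```python
def count_vfdb_hits(blast_lines, max_evalue=1e-1):
    """Count queries whose first reported hit passes the E-value cut-off.

    blast_lines: iterable of lines in BLAST tabular format (-outfmt 6).
    """
    seen = set()
    passing = 0
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        query = fields[0]
        if query in seen:
            continue  # only the first hit per query is considered
        seen.add(query)
        if float(fields[10]) <= max_evalue:
            passing += 1
    return passing
```

Running the same count over each proteome would give one figure per strain, comparable to the "Putative pathogenicity determinants" column of Table 1.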

Galleria mellonella killing assay 

Infection of Galleria mellonella larvae was accomplished as previously described for E. faecalis by
Lebreton et al. [29]. Briefly, using a syringe pump (microliter #750, Hamilton), larvae (about 0.2 g and
3 cm in length) were infected subcutaneously with washed E. faecalis JH2-2 or E. mundtii CRL1656 cells from
an exponential culture in LBG, administered in 5 µl of sterile saline buffer. Control groups of larvae
received 5 µl of saline solution only. In each test, 15 larvae were infected, and the experiments were
repeated at least three times. Larval killing was monitored up to 89 hours post-infection. Survival curves
were constructed by the Kaplan-Meier method and compared by log-rank analysis (R statistical software).
P values of <0.05 were considered statistically significant [29].
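The Kaplan-Meier estimate used for the survival curves can be written in a few lines. The authors used R; the Python version below is only an illustrative sketch with a hypothetical function name, and it does not include the log-rank comparison. Larvae still alive at the end of monitoring are treated as censored.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with a death.

    times:  observation time of each larva (hours post-infection)
    events: True if the larva died at that time, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, died in zip(times, events) if ti == t and died)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored larvae leave the risk set
    return curve
```

With five larvae, two dying at 24 h, one at 48 h and two alive at 89 h, the estimate drops to 0.6 at 24 h and to 0.4 at 48 h.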

Results 

General features of the E. mundtii genome and 
phylogenetic analysis 

The genome sequence of E. mundtii strain CRL1656 was 
automatically annotated using the RAST server [9]. A
total of 2,589 coding sequences (CDS) and 57 structural 
RNAs (52 tRNAs) were predicted by this method. Puta- 
tive biological roles have been assigned for 1,724 (67%) 
of the ORFs, whereas the remaining 865 (33%) encode 
hypothetical proteins for which no probable function 
could be predicted. The general features of the genome 
are summarized in Figure 1 and Table 1. 



E. mundtii CRL1656 phylogeny was analyzed with a 
tree generated from the concatenated sequences of 805 
core proteins from nine enterococci and the outgroup 
L. lactis SK11 (Figure 2A). In this analysis we included 
representative enterococcal species of diverse origins, 
whose genomic sequences have been released (Table 1). 
We observed that E. mundtii is most closely related to E. faecium,
which is frequently associated with nosocomial
infections.

To further analyze the relationships among these enterococcal species, we plotted the percentage of genes
with similar function shared between E. mundtii and each species under comparison against the corresponding
average nucleotide identity (ANI) values (Figure 2B). ANI values are frequently used to verify prokaryotic
species definitions [18,30]. As expected, these values were lower than the 94% accepted as a threshold for
species designation [18,30]. E. faecium strains showed the highest ANI values (around 76%), with a content
of shared genes with similar function between 44 and 46%. For the other enterococcal species, ANI varied
within a narrow range (approximately 71%), and E. mundtii shared more genes with similar function with
strains of non-clinical origin, such as E. casseliflavus ATCC 12755 and E. saccharolyticus ATCC 43076, than
with the rest of the analyzed strains.

We were also interested in performing a bioinformatic quantification of genes horizontally transferred to
the E. mundtii genome. According to the Alien-Hunter software, 39% of the genome was predicted to have been
acquired by horizontal transfer. A list of genes coding for integrases, transposases and phage-related
proteins found by the RAST annotation is given in Additional file 2: Table S1. In this regard, four gene
clusters that fit the criteria for GEIs (refer to Methods for details) were detected and are indicated in
Figure 1. GEI I contains a putative cell wall surface anchor protein that could be involved in adhesion and
invasion mechanisms. Interestingly, a gene for the bacteriocin mundticin was detected in the GEI III cluster,
where it is genetically linked to the genes for its immunity protein and transporter.

Regarding genetically distinctive features of E. mundtii, a 
comparative analysis performed between strain CRL1656 
and the other enterococcal species included in this study 
(Table 1) revealed a set of 22 unique proteins with putative 
functions for this microorganism (Table 2). Among this 
group of proteins, besides the mundticin previously men- 
tioned, two proteins that could be responsible for capsule 
biosynthesis (see Virulence factors and antibiotic resistance 
section) and a toxin-antitoxin MazE-MazF system were 
detected. 

Identification of a gene encoding a putative bacteriocin 

A cluster of three genes involved in bacteriocin synthesis 
and transport, which is unique to this microorganism, 

Genes with putative function located in each of the predicted genomic islands (GEIs)

GEI I:
Cell wall surface protein 2C

GEI II:
LSU ribosomal protein L35p
LSU ribosomal protein L20p
Retron-type RNA-directed DNA polymerase
Phage integrase
Resolvase/integrase
Major Facilitator Superfamily

GEI III:
Resolvase-like protein
Recombinase sin
Piscicolin 126 immunity protein
Bacteriocin mundticin
Xanthine/uracil/thiamine/ascorbate permease
Cation-transporting ATPase

GEI IV:
SSU ribosomal S9p
DNA-cytosine methyltransferase
DNA-cytosine methyltransferase

Figure 1 Circular representation of the E. mundtii genome. Starting from the outer circle: Rings 1 and 2 (light-blue) indicate positions of
protein-coding genes on the positive and negative strands, respectively. Ring 3 (red) shows presumably foreign DNA as predicted by Alien-Hunter.
Ring 4 (grey) shows alien genes predicted by Colombo/SIGI-HMM. Ring 5 (green) corresponds to IslandPath-DIMOB predictions. Ring 6 (black) shows
positions of tRNA genes. Ring 7 (blue), putative transposases. Ring 8 (ochre), integrases. Ring 9 represents the G + C percentage, colored yellow
for regions above the median GC score (38%) and violet for regions less than or equal to the median. Circular sectors I-IV highlight the positions of
putative genomic islands. A list of genes with putative functions in each GEI is presented.




was found. These genes were designated munA, munB and munC. The structural gene munA encodes a 58-amino acid
mundticin precursor. The mature peptide is a class IIa mundticin with an estimated molecular weight of
4,289 Da and a pI of 9.45. A BLASTP homology search with the 43-amino acid deduced mature peptide,
mundticin 1656, revealed that its sequence is identical to that of the plasmid-encoded mundticin KS from
E. mundtii NFRI7393 [31] and 95% identical to that of E. mundtii AT06 [32]. The pre-peptide contains a leader
peptide of 15 amino acids with a consensus GGXaa processing site. The mature peptide contains the YGNGV motif
at positions 3-7, characteristic of class IIa bacteriocins. Moreover, it contains the two cysteine residues
(C-9 and C-14) forming the disulfide bridge, which are well conserved in all class IIa bacteriocins [33].
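The sequence features listed for the mature peptide (YGNGV at positions 3-7, cysteines at positions 9 and 14) are easy to check programmatically. The sketch below uses a hypothetical function name, and the peptide in the usage note is a made-up toy string, not the mundticin sequence, which is not given in the text.

```python
def has_class_iia_features(mature_peptide):
    """Check YGNGV at positions 3-7 and Cys at positions 9 and 14 (1-based)."""
    if len(mature_peptide) < 14:
        return False
    motif_ok = mature_peptide[2:7] == "YGNGV"
    cysteines_ok = mature_peptide[8] == "C" and mature_peptide[13] == "C"
    return motif_ok and cysteines_ok
```

For the toy peptide `KYYGNGVSCNKKGCAAAA` the function returns `True`; replacing the cysteine at position 9 with alanine makes it return `False`.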

The munB gene encodes a 674-amino acid polypeptide,
which exhibits 98% similarity to the 674-amino acid
ABC transporter MunB found in E. mundtii
NFRI7393, involved in the maturation and excretion of



[Figure 2 image: (A) core gene tree with bootstrap support values; (B) plot of shared genes with similar function (%) versus ANI (%).]

Figure 2 Phylogenetic analysis of E. mundtii CRL1656. (A) Core gene tree. ClustalX-aligned sequences of concatenated core proteins were
used for the phylogeny reconstruction of enterococcal species using the Randomized Axelerated Maximum Likelihood (RAxML) algorithm
based on DCMUT with empirical base frequencies and the GAMMA distribution model. The reliability of the inferred tree was tested by bootstrapping
with 1000 replicates. The tree with the highest log likelihood (-2038246) is shown. Lactococcus lactis SK11 was included as the outgroup species.
(B) ANI plot. Shared genes with similar function and ANI values between E. mundtii and the bacterial genomes of the indicated strains are plotted.
E. mundtii CRL1656 (E. mundtii), L. lactis subsp. cremoris SK11 (L. lactis), E. italicus DSM15952 (E. italicus), E. casseliflavus ATCC12755 (E. casseliflavus),
E. faecalis 62 (E. fae 62), E. faecalis V583 (E. fae V583), E. saccharolyticus ATCC 43076 (E. saccharolyticus), E. gallinarum EG2 (E. gallinarum), E. faecium
Com15 (E. fm Com15), and E. faecium DO (E. fm DO). The data set supporting the results of this article is available in the TreeBASE repository,
[http://purl.org/phylo/treebase/phylows/study/TB2:S15854].
Table 2 Unique genes among enterococcal species found in the E. mundtii genome

Feature ID                  Start    Stop     Length (AA)  Putative function                                               GEI
fig|6666666.25220.peg.49    39017    39349    111          Putative EsaC protein analog (Listeria type 3)
fig|6666666.25220.peg.389   768326   768559   78           Programmed cell death antitoxin MazE
fig|6666666.25220.peg.401   780662   782068   469          Retron-type RNA-directed DNA polymerase (EC 2.7.7.49)
fig|6666666.25220.peg.507   875260   875538   93           Zinc finger domain-containing protein
fig|6666666.25220.peg.523   889680   889808   43           Probable GTPase related to EngC
fig|6666666.25220.peg.751   1130121  1131056  312          Mtx2
fig|6666666.25220.peg.781   1159802  1160566  255          Hnh endonuclease
fig|6666666.25220.peg.928   1322901  1323290  130          Holin
fig|6666666.25220.peg.1128  1488458  1487922  179          Rhs-family protein
fig|6666666.25220.peg.1403  1721023  1720727  99           putative piscicolin 126 immunity protein                        III
fig|6666666.25220.peg.1405  1723390  1723214  59           Bacteriocin mundticin                                           III
fig|6666666.25220.peg.1456  1777938  1778441  168          Phosphoribosylanthranilate isomerase like (EC 5.3.1.24)
fig|6666666.25220.peg.1618  1965831  1965595  79           COG0477: Permeases of the major facilitator superfamily
fig|6666666.25220.peg.1862  2217937  2219130  398          ATPase
fig|6666666.25220.peg.1907  243734   245122   463          endo-beta-galactosidase, GlcNAc-alpha-1,4-Gal-releasing
fig|6666666.25220.peg.1961  2301444  2302463  340          PE-PGRS family protein
fig|6666666.25220.peg.2071  2406250  2405168  361          capsular polysaccharide biosynthesis protein Cps4G
fig|6666666.25220.peg.2073  2408118  2407363  252          putative capsule biosynthesis protein
fig|6666666.25220.peg.2156  2483290  2482904  129          Putative EsaC protein analog (Listeria type 3)
fig|6666666.25220.peg.2203  2515077  2514832  82           SMT0609 replicon stabilization protein (antitoxin to SMT0608)
fig|6666666.25220.peg.2352  2693714  2693941  76           COG0477: Permeases of the major facilitator superfamily
fig|6666666.25220.peg.2403  2758856  2757387  490          Internalin-like/N-acetylmuramoyl-L-alanine amidase



enterocin KS [31]. The munC gene encodes a mundticin 
immunity protein of 98 amino acids, as evidenced by the 
close similarity to the immunity protein of mundticin 
KS [31]. 

Stress response systems 

One of the most relevant characteristics of the Enterococcus genus is its capacity to resist and grow at
low pH. Thus, we analyzed the presence of genes involved in the acidic stress response in the E. mundtii
genome. We identified eight genes encoding the different subunits of the F0F1-ATPase, which is also present
in the enterococcal core genome previously described. This complex is involved in pH homeostasis through the
generation of proton motive force by ATP consumption [34]. The organization of these genes was the same as
that for other enterococcal F0F1-ATPases (atpBEFHAGDC), where atpBEF encode the A, C and B subunits of the
membrane-bound F0 domain, whereas the atpHAGDC genes encode the δ, α, γ, β and ε subunits of the cytoplasmic
F1 domain, respectively [34]. In addition, the genome of E. mundtii possesses an operon constituted by nine
genes (F, I, K, E, C, G, A, B and D) homologous to the ntp genes of E. hirae [35], encoding a Na+-pumping
V1V0-ATPase. The V1 domain is a peripheral complex responsible for ATP hydrolysis, and is constituted by the
A, B, C, D, E, F and G subunits. The V0 domain is an integral complex responsible for Na+ translocation
across the membrane.

Furthermore, three gene clusters indirectly involved in resistance to low pH were identified. One of them
corresponds to the cit genes, homologous to those recently characterized in E. faecalis, which are responsible
for the degradation of citrate and are organized in two divergent putative operons [36-39]. The first
transcriptional unit contains genes encoding a GntR-family transcriptional regulator and a citrate permease
belonging to the Me2+-CitMHS family (citO and citH genes, respectively). In the second operon, genes encoding
the citrate lyase subunits (citDEF), citrate lyase accessory genes (citC and citX) and a membrane oxaloacetate
decarboxylase (oadAHBD) were found. This genetic organization resembles that found in E. faecium, which
differs from the E. faecalis cit locus in the position of another citrate lyase accessory gene, citG, and also
in the absence of the soluble oxaloacetate decarboxylase gene, citM. The second group of genes comprises the
mleS (malolactic enzyme), mleT (malate transporter) and mleR (transcriptional activator of the LysR family)
genes, which are required for malolactic fermentation and show homology to those found in E. faecium [40].
Finally, genes associated with the arginine deiminase (ADI) pathway, which generates ATP and contributes to
low-pH resistance, were identified. These genes are present in the core genome of the selected enterococci and
code for the three main enzymes of the system: arginine deiminase, ornithine transcarbamylase and carbamate
kinase. Additionally, we identified a divergently oriented homologue of arcR, which encodes an activator of
the ADI system (Crp-Fnr family of regulators). Besides, a homologue of the arginine-ornithine transporter
(arcD) of Lactococcus garvieae ATCC 49156 was identified in another genomic region of E. mundtii.

Other related proteins required for optimal stress resistance were identified in the genome of E. mundtii:
RecA, a mediator of homologous recombination and regulator of the SOS response; the chaperonins GroEL and
GroES; HtrA, a protein involved in proteolysis of abnormal proteins synthesized under stressful conditions;
the enzymes involved in the synthesis of D-alanyl-lipoteichoic acid (dlt operon, see Table 3); and the
diacylglycerol kinase DagK, involved in acid resistance [34].

Table 3 Putative virulence factors present in E. mundtii

Virulence factor  Genes            Homologous locus                            Role in pathogenesis                    Reference
Adhesin
Pili              scm              EfaeDRAFT_0418                              Adherence                               [41]
                  ebpABC           EF1091-EF1093                               Biofilm formation                       [42]
                  ebpR             EF1090                                      ebp locus regulation                    [43]
                  srtC             EF0194                                      Pilus sortase C                         [42]
                  srtA             EF3056                                      Pilus sortase A                         [44]
                  rnjB             EF1185                                      ebp locus regulation                    [45]
Capsule
                  eps locus        EFSG_00424, -25, -26, -31, -33, -35 to -41  Resistance to phagocytosis              [16]
Cell wall
                  galU             EF1746                                      Resistance to multiple types of stress  [46]
                  bgsAB            EF2891-EF2890                               Adherence and biofilm formation         [47]
                  dlt locus        EF2749-EF2746                               Biofilm formation                       [46]
                  bopD             EF0954 (MalR)                               Biofilm formation                       [48]
                  epa locus        EFWG_01395-EFWG_01370                       Biofilm and resistance to PMNs          [49]
                  sagA             EfaeDRAFT_1606                              Adherence                               [50]
Others
                  efafm            EfaeDRAFT_2037                              Adherence                               [43]
                  msrA             EF1681                                      Oxidative stress                        [51]
                  msrB             EF3164                                      Oxidative stress                        [51]
                  gls33            EfaeDRAFT_2384                              Stress response protein                 [52]
                  gls20            EfaeDRAFT_2389                              Stress response protein                 [52]
                  sigV1 and sigV2  EF3180                                      Mouse bacteremia                        [53]
                  eep              EF2380                                      Biofilm formation                       [54]



Analysis of regulators 

Since adaptation to different niches requires a fine- 
tuning of gene expression, we searched for regulators re- 
sponsible for sensing the bacterial environmental milieu 
and physiological state. By knowing the set of regulators 
shared by related bacteria and detecting the presence of 
those that are unique, it is possible to infer the common 
and specific characteristics of niches and lifestyle among 
them. In the attempt to study the complete set of regula- 
tors encoded by the E. mundtii genome, a database 
consisting of 122 genes was constructed (see Methods 
section) and a comparative analysis against E. faecalis 
V583 and E. faecium DO was performed. 

Core regulators 

Fifty-eight (48%) of the analyzed genes were found in the three genomes (Additional file 1: Figure S1).
Although 24 of these genes are not assigned to any RAST subsystem, the rest were implicated in diverse
functions, including sugar uptake and utilization (mannose, lactose, galactose, D-tagatose and galactitol),
anabolic pathways (purine, pyrimidine, deoxyribonucleotides, glutamine, glutamate, aspartate and asparagine,
fatty acids and pyridoxine), degradation of amine compounds (arginine, ornithine, polyamine, chitin and
N-acetylglucosamine), ATP-dependent proteolysis, phosphate uptake, competence, heme and hemin uptake and
utilization systems, metal homeostasis/resistance (copper, cobalt, zinc and cadmium) and oxidative stress
(Additional file 3: Table S2, Core regulators).

Non-core regulators 

Thirty-seven (30%) of the analyzed regulators were found in E. mundtii and either E. faecium (20%; 25 genes)
or E. faecalis (10%; 12 genes), but not in the three genomes at the same time (Additional file 1: Figure S1).
Although higher cut-in and cut-off values were used for ortholog assignment to E. faecium than to E. faecalis
(refer to Methods section), more orthologs were found in the former, which correlates with the closer
phylogenetic relationship between the species. While 25 of the 37 genes are not assigned to any RAST
subsystem, the remaining 12 were assigned to diverse functions, including sugar uptake and utilization
(fructooligosaccharides, raffinose, lactose and galactose), heme and hemin uptake and utilization systems,
degradation of amine compounds (arginine and ornithine), organic acid metabolism (pyruvate and sialic acid)
and oxidative stress (Additional file 3: Table S2, Non-core regulators).

E. mundtii species-specific regulators 

Orthologs of 27 genes (22%) could not be found in E. faecium or E. faecalis; 12 of these genes (10%) have
orthologs in other LAB. The remaining genes only have orthologs in phylogenetically more distant bacteria
(Bacillus, Clostridium, Listeria). Although none of these genes belongs to a RAST subsystem, a preliminary
synteny analysis assigned them to functions related to carbon compound utilization (maltose, xylose,
cellobiose and formaldehyde), anabolic pathways (sulfur-containing compounds, coenzyme A, lysine), cell
surface functions, ABC-type transporters and other roles (hydrolase, DNA methylation, ribosome-related)
(Additional file 3: Table S2).
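The percentages quoted for the regulator classes follow from the gene counts over the 122-gene set. A short check is sketched below. The counts are taken from the text, while the dictionary labels are only illustrative.

```python
# Gene counts reported in the text for each class of regulator.
counts = {
    "core": 58,
    "shared with E. faecium only": 25,
    "shared with E. faecalis only": 12,
    "E. mundtii-specific": 27,
}

total = sum(counts.values())  # size of the regulator database
percentages = {name: round(100 * n / total) for name, n in counts.items()}
non_core_percentage = round(100 * (25 + 12) / total)
```

The four classes add up to the 122 regulators analyzed, and the rounded shares reproduce the 48%, 20%, 10% and 22% given above, with 30% for the two non-core groups combined.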

Among all the transcriptional regulators found in the 
E. mundtii genome, there is a subset corresponding to 
response regulators (RR) putatively forming part of two- 
component signal transduction systems (TCS). The im- 
portance of TCS for the ability of E. faecalis to respond 
to environmental stimuli has been previously stressed by 
Hancock and Perego [55]. The E. mundtii genome con- 
tains putative genes coding for 17 TCS and one orphan 
response regulator (Additional file 4: Table S3). We 
found E. faecalis homologues for 14 of the E. mundtii 
TCS and also for the orphan RR. As a general rule for 
most bacterial sensor kinases, all of the histidine kinases
(HKs) in E. mundtii were predicted to be membrane localized
(data not shown). Among the remaining HKs, one (IIb) showed
homology to proteins present in environmental enterococci,
and HKs IIIa-i and j presented homology to sensors
encoded by E. faecium. This analysis suggests that
the set of TCS present in E. mundtii is highly similar to 
that encoded by other enterococcal species. 

Identification of genes involved in pigmentation 

E. mundtii CRL1656 shows a characteristic yellowish
pigmentation on agar plates, caused by the production
of carotenoids as reported for other Enterococcus species 
[56]. Genes responsible for carotenoid production have 
been previously described in Staphylococcus aureus [57] 
and Lactobacillus plantarum [58]. It has been shown that 
crtM and crtN constitute a bicistronic operon encoding 
the enzymes responsible for the production of the yellow 
pigment, presumably the C30 carotenoid 4,4'-diaponeur- 
osporene. In S. aureus, diaponeurosporene is further con- 
verted to staphyloxanthin, the orange carotenoid present 
in most staphylococci strains. For this, S. aureus harbors 
up to three extra enzymes coded by genes crtO, crtP and 
crtQ, which are located in the same operon as crtM and 
crtN (crtOPQMN) [57]. The E. mundtii genome encodes a 
protein with 68 and 77% similarity to CrtN from S. aureus 
and L. plantarum, respectively. Regarding CrtM, the hom- 
ology is reversed, with the homologue from L. plantarum 
sharing a similarity higher than the one corresponding to 
S. aureus (64 versus 53%, respectively). Remarkably, this 
analysis also detected the presence of putative homologues 
for S. aureus CrtO (60%), CrtP (73%) and CrtQ (52%). 
However, genes coding for these proteins are located in a 
different cluster with respect to crtN and crtM, and also 
show a different organization (crtPQO) to that found in 
S. aureus [57]. Furthermore, a comparative analysis with 
other enterococcal species indicated that crtN and crtM 
homologues were present in E. gallinarum, E. saccharolyticus
and E. casseliflavus genomes. Notably, putative
proteins involved in staphyloxanthin biosynthesis were
also detected in E. casseliflavus.

Virulence factors and antibiotic resistance 

The potential virulence genes encoded by the genome of 
E. mundtii are of particular interest. An initial comparative
analysis between different enterococcal species using
the Virulence Factors Database [24] revealed that
E. mundtii has a reduced number of putative virulence
determinants (Table 1). To validate our results, we searched
for virulence genes previously analyzed in G. mellonella or
mouse infection models [3,46,59] (Table 3). Among
collagen adhesins, only the scm gene, which is widespread
in E. faecium [41], was detected in the E. mundtii genome;
the most extensively studied ace and acm genes are absent.
We also found the ebp pili coding cluster, in conjunction 
with its cognate transcriptional regulator EbpR [43] 
along with sortase proteins [42,44], which are ubiquitous
in E. faecalis and E. faecium. Furthermore, we identified 
the rnjB gene, which encodes a putative RNase J2 that 
activates the transcriptional expression of the ebpABC 
operon [45]. Notably, expression of the ebp locus is
negatively regulated by the Fsr system [60], which is also 
present in E. mundtii (Additional file 4: Table S3). None 
of the other pili-encoding loci previously reported for 
E. faecalis (bee plasmidic genes [61]) or E. faecium [41]
were detected. A cluster of genes that might be respon- 
sible for the synthesis of capsular polysaccharides 
(eps locus, Additional file 5: Figure S2) was identified. 
It codes for proteins with homology to those reported in
E. faecium [16]. In E. mundtii, we found three of the
four phosphoregulatory system proteins conserved in all
strains of E. faecium (the LytR-CpsA-Psr family protein is
absent). Furthermore, two glycosyl transferases and the
conserved dehydrogenase and flippase were detected as 
well as several conserved PTS-related proteins linked 
to the eps cluster in E. faecium 1,141,733 (Additional 
file 6: Table S4). 

Moreover, several virulence genes that are related to 
cell wall synthesis in E. faecalis were found (Table 3). 
Remarkably, genes responsible for the synthesis of Epa 
(enterococcal polysaccharide antigen, suggested to be
associated with the cell wall) were detected (Additional file 5:
Figure S2). The core genes showed a genetic organization
similar to that observed in E. faecium [16]. In
E. mundtii, the variable accessory region encodes predicted
glycosyltransferases and other proteins with potential
roles in wall teichoic acid (WTA) production (encoded by
tag genes), and the genetic organization resembles that of
E. faecium Com15, a microorganism of human intestinal
origin (Additional file 6: Table S4). Since major growth
defects have been detected for mutants in different 
genes of this locus, Rigottier et al. [46] have proposed 
that these genes are important for the fitness of the bac- 
terium rather than bona fide virulence factors. 

Other virulence genes detected correspond to msrA 
and msrB, encoding methionine sulfoxide reductases im- 
portant for the oxidative stress response, macrophage 
survival, and persistent infection [51]; and gls genes en- 
coding general stress proteins (Gls33 and Gls20), im- 
portant for adaptation to the intestinal environment and 
in mouse peritonitis models [52]. Moreover, this analysis 
detected the presence of a gene with significant hom- 
ology to bopD, coding for the transcriptional regulator 
of maltose metabolism, which has been implicated in 
biofilm formation and bacteremia in mice [48]. 

Regarding resistance to antibiotics, the search for genes
conferring resistance to vancomycin (vanA, vanB, vanC, vanD
and vanE), tetracycline (tetM, tetL, tetS, tetO, tetK and
tetW), gentamicin (aac(6')-aph(2'')), chloramphenicol
(cat), lincosamide (lnuA), erythromycin (ermT and
ermB), methicillin (mecA) and penicillin (blaZ) did not
identify genes with significant homology (Additional file 7:
Table S5). Furthermore, no mutations conferring resistance
to ciprofloxacin or ampicillin were identified within
gyrA or pbp5, respectively.

E. mundtii effect on G. mellonella survival 

To study E. mundtii virulence, the CRL1656 strain
was used to infect the insect host model Galleria mellonella
[29,46]. G. mellonella is a reliable model host to
study the pathogenesis of numerous human pathogens
[62]. The capacity of a pathogen to kill G. mellonella
correlates with virulence in mammalian models [62].
In fact, the innate immune systems of Galleria larvae 
and mammals share a high degree of homology [63]. 

In our analysis, the E. faecalis JH2-2 strain was used as 
control [64]. This strain was initially isolated from a 
nosocomial infection, and constitutes a genetic model 
extensively used in our laboratory [36,38,40,65] and by 
others [66,67]. The semiquantitative analysis performed
(as described in the Methods section) on the predicted
proteins of the JH2-2 genome revealed 874 hits for putative
virulence factors. As shown in Figure 3, G. mellonella
survival was significantly greater (log-rank test, P < 0.05)
after infection with E. mundtii than with the E. faecalis
JH2-2 strain, at two different enterococcal concentrations.
In the control groups treated with sterile buffered
saline solution, no larvae died in any of the replicates
(data not shown). Consequently, these results
confirm that E. mundtii is not as effective as E. faecalis
in colonizing and killing G. mellonella larvae.
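
The survival comparison described above relies on Kaplan-Meier curves and a log-rank test. The following is a minimal sketch of both calculations in plain Python, offered only as an illustration: it is not the authors' analysis code, the function names are hypothetical, and the two groups of 15 larvae use invented death times rather than the data behind Figure 3.

```python
import math


def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times  -- observation time for each larva (h)
    events -- 1 if the larva died at that time, 0 if it was still alive (censored)
    Returns a list of (time, survival probability) at each death time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value)."""
    death_times = sorted(
        {t for t, e in zip(times_a + times_b, events_a + events_b) if e == 1}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in death_times:
        n_a = sum(1 for ti in times_a if ti >= t)  # at risk in group A
        n_b = sum(1 for ti in times_b if ti >= t)  # at risk in group B
        d_a = sum(1 for ti, e in zip(times_a, events_a) if ti == t and e == 1)
        d_b = sum(1 for ti, e in zip(times_b, events_b) if ti == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented example: 15 larvae per group, followed for 72 h (NOT the study data).
mild_times = [48, 60] + [72] * 13
mild_events = [1, 1] + [0] * 13
severe_times = [24] * 3 + [36] * 3 + [48] * 3 + [72] * 6
severe_events = [1] * 9 + [0] * 6

chi2, p_value = logrank_test(mild_times, mild_events, severe_times, severe_events)
print(kaplan_meier(mild_times, mild_events))
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

In practice a dedicated statistics package would normally be used; the sketch only makes explicit what "compared by log-rank analysis" computes.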

Conclusion 

In this report, a study of the genomic data of E. mundtii 
CRL1656 is presented for the first time. Additionally, a 
comparative analysis including the full genomic se- 
quence of representative enterococcal species of diverse 
origins was conducted (Table 1). E. mundtii CRL1656 
contains 805 CDS in common with other species of En- 
terococcus and only a low number of unique CDS, 
among which we identified a cluster encoding a bacteri- 
ocin and other related proteins. A phylogenetic tree for 
concatenated sequences of enterococcal core proteins 
was constructed (Figure 2A), indicating that E. mundtii 
is closer to E. faecium. This is in accordance with the 
sodA gene sequence phylogenetic comparison reported 
by Poyart et al. [68], which showed that these two en- 
terococcal species cluster together. Moreover, as shown 
in Figure 2B, the highest ANI percentage was found be- 
tween these microorganisms. 

Figure 3 E. mundtii effect on G. mellonella survival. Kaplan-Meier survival analysis of G. mellonella upon infection with E. mundtii (blue lines) and
E. faecalis JH2-2 (red lines). Panels: Survival Analysis Low CFU (P < 0.001) and Survival Analysis High CFU (P = 0.019); x-axis, Time (h). Data are representative of three separate survival experiments performed with groups of 15 insects each. Survival curves
were constructed by the Kaplan-Meier method and compared by log-rank analysis. P values of <0.05 were considered statistically significant.

Bacterial genome plasticity is influenced by the presence
of GEIs, which may include genes encoding any
number of functions, notably pathogenicity determinants
[3,46,59]. Only 4 putative GEIs were identified in strain
CRL1656, and none of them carries genes that might
encode known virulence determinants. Many virulence
genes and their functions have been described in entero- 
cocci [3,46,59]. Our bioinformatic analysis revealed the 
lowest number of putative pathogenicity factors for 
E. mundtii in comparison to other Enterococcus strains 
(Table 1), including different species from diverse origins 
(clinical, food and commensal). However, the presence or
absence of virulence traits is not decisive in determining
the commensal or pathogenic nature of a bacterium, and
predictive methods have limitations and are not sufficient
to draw conclusions about the pathogenesis of the
microorganism. This is evident from Table 1, in which
the commensal strain E. faecium Com15 possesses a higher
number of putative virulence determinants (755) than the
clinical isolate E. faecium strain DO (728). In this study,
the virulence of the E. mundtii CRL1656 strain was evaluated
using the G. mellonella model. As shown in Figure 3,
E. mundtii CRL1656 was non-virulent at the low dose,
whereas the JH2-2 strain used as control killed 60% of
larvae at 60 h. At the high dose, similar results were found:
the E. mundtii strain killed only 40% of larvae,
whereas the control killed nearly 80%. Our data indicate
that E. mundtii strain CRL1656, as predicted
by the bioinformatic analysis, is less virulent than the
model E. faecalis JH2-2.

Remarkably, no homologues of the main secreted
factors shown to be critical for enterococcal pathogenesis,
sprE (serine protease) and gelE (gelatinase), were
found in E. mundtii, nor were other relevant genes such as
cyl (transport and activation of a cytolysin [69]), agg
(adherence to eukaryotic cells) and esp (cell wall protein
involved in immune evasion [70]). Interestingly, a gene
annotated as a putative hemolysin was found. This protein
contains a domain belonging to the hemolysin III family
(Pfam 03006), which includes proteins from pathogenic
and non-pathogenic bacteria, Homo sapiens and Drosophila
melanogaster. In Bacillus cereus, it has been shown to
function as a channel-forming cytolysin. In contrast, the
cytolytic hemolysins commonly expressed by many enterococcal
bacteraemia isolates are two-component peptide
systems (CylLL and CylLS) whose expression requires the
products of an eight-gene locus [71]. However, this locus is
absent in E. mundtii. Experimental assays will be needed to
define whether the type III hemolysin has a lytic role in vivo.

Some enterococcal species, particularly those thought 
to be associated with soil and non-human hosts [72], in- 
cluding E. mundtii, contain yellow pigments. Pigments 
in E. mundtii were identified as carotenoids [73]. This 
study has shown the presence of genes involved in the
C30 carotenoid biosynthetic pathway in this bacterium;
this pathway is also present in most of the sequenced
environmental enterococcal species. The ubiquitous detection
of crtM and crtN genes, involved in the biosynthesis of
the yellow 4,4'-diaponeurosporene, as well as the documented
carotenoid production by these microorganisms
[4], suggests that carotenoids play an important role in
enterococcal environmental fitness. Carotenoids
function as protectors against photodamage as they are 
able to quench ROS. Pigmented enterococci are pro- 
tected from solar inactivation and can persist for ex- 
tended periods of time in marine water relative to non- 
pigmented species [56]. Therefore, we can speculate that 
pigmentation in E. mundtii could play a protective role 
when the microorganism is exposed to the environment. 

In this work we also identified those genes encoding 
the machinery responsible for the production of a 
bacteriocin previously reported by Espeche et al. [8].
This feature constitutes a relevant and unique character- 
istic of the strain. With respect to its biological activity, 
mundticin 1656 was shown to inhibit the growth of Lis- 
teria monocytogenes and a variety of lactic acid bacteria 
[8]. Notably, the use of bacteriocins for the prevention
or treatment of mastitis in cows has the potential
to reduce the dependence on antibiotics. Given the gen- 
etic features observed throughout this work, E. mundtii 
could be used to this end, but this hypothesis needs to 
be further investigated. 
Additional file 1: Figure S1. Search for orthologs of E. mundtii CRL1656
putative regulators in E. faecalis V583 and E. faecium DO genomes. A)
General workflow. E. mundtii CRL1656 regulators were used as queries in a
BlastP search against the E. faecalis V583 and E. faecium DO genomes. The data
obtained were used to define Cut-in and Cut-Off values and subsequently to
determine the presence or absence of each regulator in both genomes.
B) Construction of a Correlation Table. Sequences retrieved from the previous
BlastP analysis with more than 80% coverage, score values higher than
50 and sharing the highest degree of identity were directly added to the
Table. C) Workflow for Cut-in and Cut-Off definition. E. faecalis V583 or
E. faecium DO proteins correlating with more than one E. mundtii CRL1656
protein were used as Input Data. Each set of E. mundtii CRL1656 proteins
was sorted by percentage of identity (Id%). Those with the highest value
and more than 50% Id% were defined as orthologs and used for
Cut-in definition. The remaining proteins were defined as paralogs and
used to set up the Cut-Off value. Protein sets whose highest Id% was
lower than 50% were analyzed individually to determine whether they
were paralogs or orthologs. Finally, the lowest Id% value among all
orthologs was defined as the Cut-in value and the highest Id% among
all the paralogs was defined as the Cut-Off value. D) Workflow for Ortholog
Table construction. The pruned correlation table results from eliminating
the paralogs defined in (C) from the correlation table defined in (B). All
proteins that shared an Id% higher than the Cut-in value were considered
present and those with an Id% lower than the Cut-Off were considered
absent. Presence or absence of proteins with shared identity between the
Cut-Off and Cut-in values was analyzed individually by synteny.

Additional file 2: Table S1. Putative functions associated to horizontal
gene transfer.

Additional file 3: Table S2. Transcriptional regulators encoded in the
E. mundtii genome.

Additional file 4: Table S3. Two-component systems present in
E. mundtii.

Additional file 5: Figure S2. epa and eps gene clusters present in
E. mundtii CRL1656. E. faecium 1,141,733 epa genes and E. faecium Com15
eps genes are shown for comparison. Genes are colored following
the code included in the figure.

Additional file 6: Table S4. Eps and epa loci in E. mundtii.
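
The Cut-in and Cut-Off workflow summarized for Additional file 1 (panels C and D) can be sketched in a few lines of Python. This is an illustrative reading of the description, not the authors' script: the function names, the tuple layout and the example identity values are all hypothetical, hits are assumed to have already passed the coverage and score filters of panel B, and sets whose best hit does not exceed 50% identity are simply skipped because the source reviews them individually.

```python
from collections import defaultdict


def define_cutoffs(hits, min_identity=50.0):
    """Derive (cut_in, cut_off) from filtered BlastP hits.

    hits -- list of (query_id, target_id, identity_percent) tuples, where
            queries are E. mundtii proteins and targets are proteins of the
            reference genome.
    """
    by_target = defaultdict(list)
    for query, target, identity in hits:
        by_target[target].append((identity, query))

    ortholog_ids, paralog_ids = [], []
    for group in by_target.values():
        if len(group) < 2:
            continue  # only targets matched by several queries are informative
        group.sort(reverse=True)  # highest identity first
        best_identity = group[0][0]
        if best_identity <= min_identity:
            continue  # reviewed case by case in the source workflow
        ortholog_ids.append(best_identity)
        paralog_ids.extend(identity for identity, _ in group[1:])

    if not ortholog_ids or not paralog_ids:
        raise ValueError("not enough multi-hit targets to define thresholds")
    # Cut-in: lowest identity among orthologs; Cut-Off: highest among paralogs.
    return min(ortholog_ids), max(paralog_ids)


def classify(identity, cut_in, cut_off):
    """Assign presence or absence of a regulator from its identity value."""
    if identity > cut_in:
        return "present"
    if identity < cut_off:
        return "absent"
    return "check synteny"  # between the thresholds: individual analysis


# Hypothetical hits, for illustration only.
example_hits = [
    ("Q1", "T1", 82.0), ("Q2", "T1", 41.0),
    ("Q3", "T2", 67.0), ("Q4", "T2", 35.0), ("Q5", "T2", 30.0),
    ("Q6", "T3", 90.0),
]
cut_in, cut_off = define_cutoffs(example_hits)
print(cut_in, cut_off, classify(55.0, cut_in, cut_off))
```

Following the wording of the workflow, both comparisons are strict, so a protein whose identity equals a threshold falls into the group examined by synteny.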